Yunfang Wu

Key Laboratory of Computational Linguistics, Ministry of Education, China, School of Computer Science, Peking University, China

ActTraitBench: Quantifying the Knowledge-Decision Gap in Large Language Models via Human-Grounded Behavioral Validation

May 28, 2026
Via arXiv

ADWIN: Adaptive Windows for Horizon-Aware On-Policy Distillation

May 27, 2026
Via arXiv

Trait-Aware Policy Optimization for Autoregressive Multi-Trait Essay Scoring

May 26, 2026
Via arXiv

RLVR Datasets and Where to Find Them: Tracing Data Lineage for Better Training Data

May 26, 2026
Via arXiv

Tool Learning Needs Nothing More Than a Free 8B Language Model

Apr 20, 2026
Via arXiv

Securing the Floor and Raising the Ceiling: A Merging-based Paradigm for Multi-modal Search Agents

Mar 02, 2026
Via arXiv

ORBIT: On-policy Exploration-Exploitation for Controllable Multi-Budget Reasoning

Jan 13, 2026
Via arXiv

Safety-Utility Conflicts Are Not Global: Surgical Alignment via Head-Level Diagnosis

Jan 07, 2026
Via arXiv

SyncThink: A Training-Free Strategy to Align Inference Termination with Reasoning Saturation

Jan 07, 2026
Via arXiv

One Tool Is Enough: Reinforcement Learning for Repository-Level LLM Agents

Dec 24, 2025
Via arXiv